Temporal Word Analogies: Identifying Lexical Replacement with Diachronic Word Embeddings

Author

  • Terrence Szymanski
Abstract

This paper introduces the concept of temporal word analogies: pairs of words which occupy the same semantic space at different points in time. One well-known property of word embeddings is that they can effectively model traditional word analogies (“word w1 is to word w2 as word w3 is to word w4”) through vector addition. Here, I show that temporal word analogies (“word w1 at time tα is like word w2 at time tβ”) can effectively be modeled with diachronic word embeddings, provided that the independent embedding spaces from each time period are appropriately transformed into a common vector space. When applied to a diachronic corpus of news articles, this method is able to identify temporal word analogies such as “Ronald Reagan in 1987 is like Bill Clinton in 1997”, or “Walkman in 1987 is like iPod in 2007”.
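
The alignment step described in the abstract can be illustrated with a minimal sketch. It assumes the transformation is a linear map fitted by least squares on words shared between two periods, which is one common choice and not necessarily the exact procedure used in the paper. The function names (`fit_linear_map`, `temporal_analogy`), the vocabulary, and the vectors are hypothetical toy data, constructed so that the later space is an exact rotation of the earlier one.

```python
import numpy as np


def fit_linear_map(source, target):
    """Least-squares matrix W such that source @ W approximates target."""
    W, *_ = np.linalg.lstsq(source, target, rcond=None)
    return W


def temporal_analogy(word, src_vocab, src_vecs, tgt_vocab, tgt_vecs, W, k=1):
    """Map `word` from the source period into the target space and
    return the k target-period words with the highest cosine similarity."""
    query = src_vecs[src_vocab.index(word)] @ W
    query = query / np.linalg.norm(query)
    tgt_unit = tgt_vecs / np.linalg.norm(tgt_vecs, axis=1, keepdims=True)
    sims = tgt_unit @ query
    top = np.argsort(-sims)[:k]
    return [tgt_vocab[i] for i in top]


# Toy data (assumption): five shared anchor words plus one word per period
# that fills the same semantic slot under a different name.
rng = np.random.default_rng(0)
shared = ["president", "music", "news", "city", "market"]
vocab_1987 = shared + ["walkman"]
vocab_2007 = shared + ["ipod"]

vecs_1987 = rng.normal(size=(6, 3))
# The 2007 space is the 1987 space under a random orthogonal transform,
# mimicking independently trained embeddings with arbitrary orientation.
rotation, _ = np.linalg.qr(rng.normal(size=(3, 3)))
vecs_2007 = vecs_1987 @ rotation

# Fit the map on the shared anchor words only, then query the replaced word.
anchors = slice(0, len(shared))
W = fit_linear_map(vecs_1987[anchors], vecs_2007[anchors])
result = temporal_analogy("walkman", vocab_1987, vecs_1987,
                          vocab_2007, vecs_2007, W)
print(result)
```

Because the toy spaces differ only by a rotation, the fitted map recovers that rotation and the query for "walkman" lands on "ipod" by construction. With real diachronic embeddings the fit is only approximate, and the choice of anchor words matters.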

Similar resources

Semantic Factors Predict the Rate of Lexical Replacement of Content Words

The rate of lexical replacement estimates the diachronic stability of word forms on the basis of how frequently a proto-language word is replaced or retained in its daughter languages. Lexical replacement rate has been shown to be highly related to word class and word frequency. In this paper, we argue that content words and function words behave differently with respect to lexical replacement ...

Situating Word Senses in their Historical Context with Linked Data

In this article we present a Semantic Web-based model for creating lexical resources in which the diachronic and, more broadly, contextual dimensions of word meaning can be explicitly represented as part of a graph-based data structure. We start by discussing why Linked Data is the right publishing approach for such diachronic datasets. We then describe our model, lemonEty, which utilizes the o...

The Diachronic Change of German Nominalization Patterns: An Increase in Prototypicality

This paper aims at accounting for the emergence and loss of constraints governing the formation of deverbal nominalizations in German from a cognitive point of view. Specifically, diachronic changes in the formation of derivatives in the suffix -ung are investigated on the basis of two large corpora of Middle High German (MHG, 1050-1350) and Early New High German (ENHG, 1350-1650) texts, respec...

Word Embeddings, Analogies, and Machine Learning: Beyond king - man + woman = queen

Solving word analogies became one of the most popular benchmarks for word embeddings on the assumption that linear relations between word pairs (such as king:man :: woman:queen) are indicative of the quality of the embedding. We question this assumption by showing that the information not detected by linear offset may still be recoverable by a more sophisticated search method, and thus is actua...

Portuguese Word Embeddings: Evaluating on Word Analogies and Natural Language Tasks

Word embeddings have been found to provide meaningful representations for words in an efficient way; therefore, they have become common in Natural Language Processing systems. In this paper, we evaluated different word embedding models trained on a large Portuguese corpus, including both Brazilian and European variants. We trained 31 word embedding models using FastText, GloVe, Wang2Vec and Wor...

Publication date: 2017